AITopics

Country: Asia > China (0.14)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Neural Information Processing Systems, Dec-23-2025, 18:13:16 GMT

Understanding Deep Architecture with Reasoning Layer

Recently, there has been a surge of interest in combining deep learning models with reasoning in order to handle more sophisticated learning tasks. In many cases, a reasoning task can be solved by an iterative algorithm. This algorithm is often unrolled, truncated, and used as a specialized layer in the deep architecture, which can be trained end-to-end with other neural components. Although such hybrid deep architectures have led to many empirical successes, the theoretical understanding of such architectures, especially the interplay between algorithm layers and other neural layers, remains largely unexplored. In this paper, we take an initial step toward an understanding of such hybrid deep architectures by showing that properties of the algorithm layers, such as convergence, stability, and sensitivity, are intimately related to the approximation and generalization abilities of the end-to-end model. Furthermore, our analysis matches nicely with experimental observations under various conditions, suggesting that our theory can provide useful guidelines for designing deep architectures with reasoning layers.
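
The idea of an unrolled, truncated algorithm used as a layer can be illustrated with a minimal sketch. It assumes a toy diagonal quadratic objective; the names `unrolled_gd_layer`, `q_diag`, and `b` are hypothetical and stand in for quantities that an upstream neural module would produce. It is not the paper's implementation.

```python
from typing import List


def unrolled_gd_layer(q_diag: List[float], b: List[float],
                      steps: int, step_size: float) -> List[float]:
    """Run a fixed number of gradient-descent steps, starting from x = 0.

    Objective (assumed for illustration):
        f(x) = 0.5 * sum(q_i * x_i^2) - sum(b_i * x_i)
    whose exact minimizer is x_i = b_i / q_i.
    """
    x = [0.0] * len(b)
    for _ in range(steps):  # "unrolled": each iteration acts like one layer
        grad = [q * xi - bi for q, xi, bi in zip(q_diag, x, b)]
        x = [xi - step_size * g for xi, g in zip(x, grad)]
    return x  # "truncated": returned after `steps` iterations, converged or not


def approximation_error(q_diag: List[float], b: List[float],
                        steps: int, step_size: float) -> float:
    """Largest coordinate-wise gap between the truncated output and the exact minimizer."""
    exact = [bi / q for q, bi in zip(q_diag, b)]
    approx = unrolled_gd_layer(q_diag, b, steps, step_size)
    return max(abs(a - e) for a, e in zip(approx, exact))
```

In this toy setting, increasing `steps` shrinks the gap to the exact minimizer, which is one simple way to see why the convergence behavior of the algorithm layer affects what the end-to-end model can approximate.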

architecture, deep architecture, name change, (5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Koubaa, Anis, Gabr, Khaled

Agentic UAVs: LLM-Driven Autonomy with Integrated Tool-Calling and Cognitive Reasoning

arXiv.org Artificial Intelligence, Dec-3-2025

Unmanned Aerial Vehicles (UAVs) are increasingly used in defense, surveillance, and disaster response, yet most systems still operate at SAE Level 2 to 3 autonomy. Their dependence on rule-based control and narrow AI limits adaptability in dynamic and uncertain missions. Current UAV architectures lack context-aware reasoning, autonomous decision-making, and integration with external systems. Importantly, none make use of Large Language Model (LLM) agents with tool-calling for real-time knowledge access. This paper introduces the Agentic UAVs framework, a five-layer architecture consisting of Perception, Reasoning, Action, Integration, and Learning. The framework enhances UAV autonomy through LLM-driven reasoning, database querying, and interaction with third-party systems. A prototype built with ROS 2 and Gazebo combines YOLOv11 for object detection with GPT-4 for reasoning and a locally deployed Gemma 3 model. In simulated search-and-rescue scenarios, agentic UAVs achieved higher detection confidence (0.79 compared to 0.72), improved person detection rates (91% compared to 75%), and a major increase in correct action recommendations (92% compared to 4.5%). These results show that modest computational overhead can enable significantly higher levels of autonomy and system-level integration.

large language model, machine learning, natural language, (19 more...)

2509.13352

Country: Asia > Middle East > Saudi Arabia (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (0.94)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Vaghasiya, Jay, Ghugarkar, Omkar, Bhat, Vishvesh, Dholaria, Vipul, McAuley, Julian

CoreThink: A Symbolic Reasoning Layer to reason over Long Horizon Tasks with LLMs

arXiv.org Artificial Intelligence, Sep-4-2025

We introduce CoreThink, a state-of-the-art Reasoning Layer built upon a novel reasoning method called General Symbolics. This approach diverges from reasoning paradigms such as test-time scaling, Supervised Fine-Tuning (SFT), and Reinforcement Learning with Verifiable Rewards (RLVR). The CoreThink General Symbolic Reasoner (GSR) is specifically structured around three key use cases: tool-calling, code generation, and planning, demonstrating exemplary performance across a total of seven benchmarks in their respective areas. Notably, we achieve SOTA scores of 66.66% on LiveCodeBench v6, 89% on Instruction-Following Evals, and 24.4% on ARC-AGI-2. We also present an agentic coding IDE, developed using the principles of General Symbolics, which achieves a state-of-the-art accuracy of 62.3% on SWE-Bench Lite. We are able to achieve these improvements without any fine-tuning or training costs. Our Reasoning Layer is designed to provide a pure performance uplift, ensuring that a model's accuracy on reasoning tasks is never negatively impacted. We argue that incumbent methods will eventually lead to diminishing returns in LLM performance, necessitating the development of new reasoning techniques. This technical report details our approach at a high level and the availability of the CoreThink models for reasoning-intensive use cases.

large language model, machine learning, natural language, (20 more...)

2509.00971

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Law (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Thakrar, Karishma, Basavatia, Shreyas, Daftardar, Akshay

Architecting Clinical Collaboration: Multi-Agent Reasoning Systems for Multimodal Medical VQA

arXiv.org Artificial Intelligence, Aug-27-2025

Dermatological care via telemedicine often lacks the rich context of in-person visits. Clinicians must make diagnoses based on a handful of images and brief descriptions, without the benefit of physical exams, second opinions, or reference materials. While many medical AI systems attempt to bridge these gaps with domain-specific fine-tuning, this work hypothesized that mimicking clinical reasoning processes could offer a more effective path forward. This study tested seven vision-language models on medical visual question answering across six configurations: baseline models, fine-tuned variants, and both augmented with either reasoning layers that combine multiple model perspectives, analogous to peer consultation, or retrieval-augmented generation that incorporates medical literature at inference time, serving a role similar to reference-checking. While fine-tuning degraded performance in four of seven models with an average 30% decrease, baseline models collapsed on test data. Clinical-inspired architectures, meanwhile, achieved up to 70% accuracy, maintaining performance on unseen data while generating explainable, literature-grounded outputs critical for clinical adoption. These findings demonstrate that medical AI succeeds by reconstructing the collaborative and evidence-based practices fundamental to clinical diagnosis. Fine-tuning large models on medical data, the standard approach to medical AI, assumes domain exposure produces clinical competence [1]. Yet dermatology models show 15% performance drops in real-world settings [2], and catastrophic forgetting causes models to generate outputs exclusively from their training data [3]. This brittleness suggests a fundamental mismatch between current approaches and clinical reasoning. Additionally, physician groups achieve 85.6% diagnostic accuracy versus 62.5% for individuals [4], as collaboration reduces cognitive load and bias [5]. However, logistical constraints force physicians to work alone, a problem telemedicine intensifies by eliminating physical exams, peer consultation, and immediate reference access [6].
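
A reasoning layer that combines several model perspectives could take many forms. The sketch below assumes the simplest one, a majority vote over normalized answers. The abstract does not say how the paper aggregates model outputs, so the voting rule and the function name `consult` are purely illustrative assumptions.

```python
from collections import Counter
from typing import List, Tuple


def consult(answers: List[str]) -> Tuple[str, float]:
    """Return the most common answer and the fraction of models that gave it.

    Majority voting is an assumption made for this sketch, not the
    aggregation method reported in the paper.
    """
    if not answers:
        raise ValueError("need at least one model answer")
    # Normalize case and whitespace so trivially different strings agree.
    counts = Counter(a.strip().lower() for a in answers)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(answers)
```

The agreement fraction gives a rough confidence signal: a low value could be used to flag a case for retrieval or human review rather than returning the majority answer outright.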

large language model, machine learning, natural language, (21 more...)

2507.0552

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Health Care Technology > Telehealth (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Dermatology (0.89)
Health & Medicine > Therapeutic Area > Oncology (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.89)
(2 more...)

Neural Information Processing Systems, Feb-11-2025, 21:03:16 GMT

Review for NeurIPS paper: Understanding Deep Architecture with Reasoning Layer

The analysis connects the properties of the underlying algorithm to the performance of the deep learning models. In the learning theory analysis, the local Rademacher complexity technique is used to obtain a tighter bound, which reveals a trade-off with respect to the number of layers. The theoretical findings are supported by numerical experiments. This paper deals with a new problem setting and gives a nice first step. Although its problem setting is quite simple, this kind of study is expected to open up a new direction of research.

deep architecture, neurips paper, reasoning layer, (1 more...)

Genre: Overview (0.60)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems, Jan-21-2025, 13:03:07 GMT

Review for NeurIPS paper: Understanding Deep Architecture with Reasoning Layer

Strengths: The most interesting aspect of this paper is the abstraction with which such a large class of potential hybrid models is dealt with. The problem setting is general enough that it's not easy to come up with architectures that would not fit this scheme. While the results part of the paper starts by revisiting some well known results on convergence of gradient descent and Nesterov's method, the study of the sensitivity to perturbations of the two algorithms seems novel. The main interesting results come in Sections 4 and 5, where the authors present first results showing that faster convergence leads to eventual better approximation. I have found the Theorem 5.1 and the corresponding theorems in the Supp.
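
The convergence gap between gradient descent and Nesterov's method that the review refers to can be seen on a small example. The sketch below assumes a diagonal, strongly convex quadratic and the standard constant-momentum form of Nesterov's method. All function names are hypothetical, and the setup is a textbook illustration, not the paper's experiment.

```python
import math
from typing import List


def gradient(q_diag: List[float], b: List[float], x: List[float]) -> List[float]:
    """Gradient of f(x) = 0.5 * sum(q_i * x_i^2) - sum(b_i * x_i)."""
    return [q * xi - bi for q, xi, bi in zip(q_diag, x, b)]


def run_gd(q_diag: List[float], b: List[float], steps: int) -> List[float]:
    """Plain gradient descent with step size 1/L, where L = max(q_diag)."""
    step = 1.0 / max(q_diag)
    x = [0.0] * len(b)
    for _ in range(steps):
        g = gradient(q_diag, b, x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def run_nesterov(q_diag: List[float], b: List[float], steps: int) -> List[float]:
    """Nesterov's method with constant momentum for strongly convex problems."""
    big_l, mu = max(q_diag), min(q_diag)
    step = 1.0 / big_l
    root_kappa = math.sqrt(big_l / mu)
    beta = (root_kappa - 1.0) / (root_kappa + 1.0)
    x_prev = x = [0.0] * len(b)
    for _ in range(steps):
        # Extrapolate along the previous direction, then take a gradient step.
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        g = gradient(q_diag, b, y)
        x_prev, x = x, [yi - step * gi for yi, gi in zip(y, g)]
    return x


def max_error(q_diag: List[float], b: List[float], x: List[float]) -> float:
    """Largest coordinate-wise gap to the exact minimizer b_i / q_i."""
    return max(abs(xi - bi / q) for q, xi, bi in zip(q_diag, x, b))
```

With an ill-conditioned choice such as `q_diag = [1.0, 100.0]`, the accelerated iterate is closer to the minimizer than the plain gradient-descent iterate after the same number of steps.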

deep architecture, neurips paper, reasoning layer, (3 more...)

Genre: Research Report > New Finding (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)

Ashfaq, Muhammad, Sadik, Ahmed R., Mikkonen, Tommi, Waseem, Muhammad, Mäkitalo, Niko

LLM-Enhanced Holonic Architecture for Ad-Hoc Scalable SoS

arXiv.org Artificial Intelligence, Jan-14-2025

As modern systems of systems (SoS) become increasingly adaptive and human-centred, traditional architectures often struggle to support interoperability, reconfigurability, and effective human-system interaction. This paper addresses these challenges by advancing the state-of-the-art holonic architecture for SoS, offering two main contributions to support these adaptive needs. First, we propose a layered architecture for holons, which includes reasoning, communication, and capabilities layers. This design facilitates seamless interoperability among heterogeneous constituent systems by improving data exchange and integration. Second, inspired by principles of intelligent manufacturing, we introduce specialised holons, namely supervisor, planner, task, and resource holons, aimed at enhancing the adaptability and reconfigurability of SoS. These specialised holons utilise large language models within their reasoning layers to support decision-making and ensure real-time adaptability. We demonstrate our approach through a 3D mobility case study focused on smart city transportation, showcasing its potential for managing complex, multimodal SoS environments. Additionally, we propose evaluation methods to assess the architecture's efficiency and scalability, laying the groundwork for future empirical validations through simulations and real-world implementations.

large language model, machine learning, natural language, (19 more...)

2501.07992

Country: Europe > Finland (0.14)

Genre: Research Report (0.82)

Industry:

Transportation (1.00)
Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Neural Information Processing Systems, Oct-9-2024, 12:46:46 GMT

Understanding Deep Architecture with Reasoning Layer

architecture, deep architecture, reasoning layer, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Apr-8-2024

Long-horizon Locomotion and Manipulation on a Quadrupedal Robot with Large Language Models

Ouyang, Yutao, Li, Jinhan, Li, Yunfei, Li, Zhongyu, Yu, Chao, Sreenath, Koushil, Wu, Yi

We present a large language model (LLM) based system to empower quadrupedal robots with problem-solving abilities for long-horizon tasks beyond short-term motions. Long-horizon tasks for quadrupeds are challenging since they require both a high-level understanding of the semantics of the problem for task planning and a broad range of locomotion and manipulation skills to interact with the environment. Our system builds a high-level reasoning layer with large language models, which generates hybrid discrete-continuous plans as robot code from task descriptions. It comprises multiple LLM agents: a semantic planner for sketching a plan, a parameter calculator for predicting arguments in the plan, and a code generator to convert the plan into executable robot code. At the low level, we adopt reinforcement learning to train a set of motion planning and control skills to unleash the flexibility of quadrupeds for rich environment interactions. Our system is tested on long-horizon tasks that are infeasible to complete with one single skill. Simulation and real-world experiments show that it successfully figures out multi-step strategies and demonstrates non-trivial behaviors, including building tools or notifying a human for help.
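
The three-agent reasoning layer described above (semantic planner, parameter calculator, code generator) can be sketched as a simple pipeline. In this sketch the LLM calls are replaced by deterministic stubs, and every name, skill, and argument (`walk_to`, `press_button`, the `robot.` prefix, the `scene` dictionary) is a hypothetical placeholder rather than the authors' interface.

```python
from typing import Dict, List, Tuple

Step = Tuple[str, Dict[str, float]]


def semantic_planner(task: str) -> List[str]:
    """Stub for the LLM agent that sketches a plan as an ordered list of skills."""
    # A real system would prompt an LLM with `task`; this fixed plan is illustrative.
    return ["walk_to", "press_button"]


def parameter_calculator(plan: List[str],
                         scene: Dict[str, Dict[str, float]]) -> List[Step]:
    """Stub for the LLM agent that predicts the arguments of each plan step."""
    return [(skill, scene[skill]) for skill in plan]


def code_generator(steps: List[Step]) -> str:
    """Stub for the LLM agent that turns the parameterized plan into robot code."""
    lines = []
    for skill, args in steps:
        arg_text = ", ".join(f"{name}={value!r}" for name, value in args.items())
        lines.append(f"robot.{skill}({arg_text})")
    return "\n".join(lines)


def plan_to_code(task: str, scene: Dict[str, Dict[str, float]]) -> str:
    """Chain the three agents: discrete plan, continuous parameters, executable code."""
    plan = semantic_planner(task)
    steps = parameter_calculator(plan, scene)
    return code_generator(steps)
```

Splitting the work this way keeps the discrete decision (which skills, in what order) separate from the continuous one (which arguments), which mirrors the hybrid discrete-continuous plans the abstract describes.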

locomotion, motion planning, robot, (14 more...)

2404.05291

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Hong Kong (0.04)
(2 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)